I, Korbin Van Dyke, state that:



Summary of My Opinions 

1. I previously submitted a declaration in connection with this proceeding (see Van
Dyke Declaration dated August 15, 2007), referenced hereinafter as the "initial Van Dyke 
declaration". For brevity, I will not repeat information set forth in the initial Van Dyke 
declaration in this declaration. 

2. In preparation of this declaration I have reviewed U.S. Patent Application Serial 
No. 10/757,866. I have also reviewed U.S. Patent Nos. 5,742,840 and 6,295,599 (respectively 
the '840 and '599 patents) that the 10/757,866 patent application indirectly claims priority to, as 
well as appendices to the '840 and '599 patents (the Terpsichore and Zeus System Architecture 
manuals, respectively, and hereinafter referred to respectively as the Terpsichore and the Zeus 
manuals). I have reviewed the Office Action for the 10/757,866 patent application mailed on 
November 5, 2007, including the paragraph on page 10 that discusses the Response to 
Arguments and particularly the Examiner's conclusion that the priority for the claimed invention 
does not extend to the '840 or the '599 patents, since limitations of claims 9, 18, 33, 34, 40, and 
41 are not supported by the '840 or the '599 patents. My understanding is that the features of the 
claimed invention are taught and supported by complying with the written description 
requirement and the enablement requirement. My understanding of the written description 
requirement is that a patent disclosure must describe the claimed invention in sufficient detail 
that one of ordinary skill in the art can reasonably conclude that the inventor had possession of 
the claimed invention at the time of filing the patent disclosure. My understanding of the
enablement requirement is that the patent disclosure must contain sufficient information 
regarding the subject matter of the claims to enable one of ordinary skill in the pertinent art to 
make and use the claimed invention. I further understand that whether the enablement 
requirement is met depends on whether undue experimentation is necessary for one of skill in the 
art to practice the invention in light of the patent disclosure. 

3. Based on my review of the materials identified in paragraph 2 of this declaration, 
it is my opinion that with respect to the following limitations relating to claims 9, 18, 33, 34, 40, 
and 41 (as amended), the disclosures of the '840 patent and the '599 patent each indicate that the 
inventors were in possession of the claimed invention of the 10/757,866 patent application as of 
the August 16, 1995 filing date of the '840 patent and further as of the August 24, 1999 filing 
date of the '599 patent; and further the disclosures of the '840 patent and the '599 patent each 
would have enabled a person of ordinary skill in the art to make and use, without undue 
experimentation, the claimed invention of the 10/757,866 patent application as of the August 16, 
1995 filing date of the '840 patent, and further as of the August 24, 1999 filing date of the '599 
patent. The limitations referred to are: 

{claim 9} "decoding a second single instruction specifying a register 
containing a first plurality of floating-point operands and another register 
containing a second plurality of floating-point operands" 

"multiplying the first plurality of floating-point operands by the 
second plurality of floating-point operands to produce a plurality of products" 

"providing the plurality of products to partitioned fields of a result 
register as a catenated result" 

{claim 18} "wherein at least some of the instructions further include a group
floating-point multiply instruction for multiplying floating-point data in the 
programmable processor, the group floating-point multiply instruction capable of 
instructing the programmable processor to perform operations" 

the operations comprising "decoding the group floating-point 
multiply instruction specifying a register containing a first plurality of floating- 
point operands and another register containing a second plurality of floating-point 
operands" 

the operations comprising "multiplying the first plurality of 
floating-point operands by the second plurality of floating-point operands to 
produce a plurality of products" 

the operations comprising "providing the plurality of products to 
partitioned fields of a result register as a catenated result" 

{claim 33} "wherein each of the first and second operands has a width of 64
bits" 






Supplemental Declaration of Korbin S Van Dyke 



{claim 34} "a step of executing a plurality of different group floating-point
arithmetic operations that arithmetically operate on multiple floating-point 
operands stored in partitioned fields of registers in the register file to produce a 
catenated result that is returned to a register in the register file, wherein the 
catenated result comprises a plurality of individual floating-point results" 

{claim 40} "wherein each of the first and second operands has a width of 64 
bits" 

{claim 41} "wherein the plurality of instructions further comprises a plurality
of different group floating-point arithmetic operations that arithmetically operate 
on multiple floating-point operands stored in partitioned fields of registers in the 
register file to produce a catenated result that is returned to a register in the 
register file, wherein the catenated result comprises a plurality of individual 
floating-point results" 

Summary of '840 Analysis: 

4. The disclosure of the '840 patent provides detailed information and description 
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 9 (as amended) of the 10/757,866 patent application, and that I further believe would have 
enabled a person of ordinary skill in the art to make and use the claimed invention without undue 
experimentation. For example, on at least pages 19-21 (describing floating-point data formats), 
24-25 (describing general registers), 29 and 47-48 (describing floating-point arithmetic 
hardware), and 129-131 of the Terpsichore manual (describing details of Group Floating-point 
instructions such as various forms of Group Floating-point Multiply instructions) there are 
detailed descriptions of the aforementioned claim elements. 

5. The aforementioned limitations of claim 18 (a computer-readable storage medium 
claim) are substantially similar to the aforementioned limitations of claim 9 (a method claim). 
Further, parent claim context providing antecedent basis for claim 18 (specifically "the execution 
unit") is substantially similar to corresponding parent claim context of claim 9. Thus, for at least 
the reasons described in paragraph 4 of this declaration, the disclosure of the '840 patent 
provides detailed information and description that I believe indicates that the inventors were in 
possession of the aforementioned limitations of claim 18 (as amended) of the 10/757,866 patent 
application, and that I further believe would have enabled a person of ordinary skill in the art to 
make and use the claimed invention without undue experimentation. 

6. The disclosure of the '840 patent provides detailed information and description 
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 33 (as amended) of the 10/757,866 patent application, and that I further believe would 
have enabled a person of ordinary skill in the art to make and use the claimed invention without 
undue experimentation. For example, on at least pages 24-25 (describing general registers), 26 
(generally describing store instructions), and 150-157 of the Terpsichore manual (describing 
details of Store and Store Immediate instructions such as various forms of Store Immediate and 
Store Multiplex Immediate instructions) there are detailed descriptions of the aforementioned 
claim elements. 

7. The disclosure of the '840 patent provides detailed information and description
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 34 (as amended) of the 10/757,866 patent application, and that I further believe would 
have enabled a person of ordinary skill in the art to make and use the claimed invention without 
undue experimentation. For example, on at least pages 19-21 (describing floating-point data 
formats), 24-25 (describing general registers), 29 and 47-48 (describing floating-point arithmetic 
hardware), and 129-131 of the Terpsichore manual (describing details of Group Floating-point 
instructions such as various Group Floating-point Add, Divide, and Multiply forms) there are 
detailed descriptions of the aforementioned claim elements. 

8. The aforementioned limitations of claim 40 (a computer-readable storage medium 
claim) are substantially similar to the aforementioned limitations of claim 33 (a method claim). 
Further, parent claim context providing antecedent basis for claim 40 (specifically "the first and 
second operands") is substantially similar to corresponding parent claim context of claim 33. 
Thus, for at least the reasons described in paragraph 6 of this declaration, the disclosure of the 
'840 patent provides detailed information and description that I believe indicates that the 
inventors were in possession of the aforementioned limitations of claim 40 (as amended) of the 
10/757,866 patent application, and that I further believe would have enabled a person of ordinary 
skill in the art to make and use the claimed invention without undue experimentation. 

9. The aforementioned limitations of claim 41 (a computer-readable storage medium 
claim) are substantially similar to the aforementioned limitations of claim 34 (a method claim). 
Further, parent claim context providing antecedent basis for claim 41 (specifically "the execution 
unit") is substantially similar to corresponding parent claim context of claim 34. Thus, for at 
least the reasons described in paragraph 7 of this declaration, the disclosure of the '840 patent 
provides detailed information and description that I believe indicates that the inventors were in 
possession of the aforementioned limitations of claim 41 (as amended) of the 10/757,866 patent 
application, and that I further believe would have enabled a person of ordinary skill in the art to 
make and use the claimed invention without undue experimentation. 

Summary of '599 Analysis: 

10. The disclosure of the '599 patent provides detailed information and description 
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 9 (as amended) of the 10/757,866 patent application, and that I further believe would have 
enabled a person of ordinary skill in the art to make and use the claimed invention without undue 
experimentation. For example, on at least pages 14-16 (describing floating-point data formats), 
19-20 (describing general registers), 23-24 and 55 (describing floating-point arithmetic 
hardware), and 258-260 of the Zeus manual (describing details of Ensemble Floating-point 
instructions such as various forms of Ensemble Multiply Floating-point instructions) there are 
detailed descriptions of the aforementioned claim elements. Note that in the Zeus manual, group 
floating-point instructions are termed "ensemble" floating-point instructions. 

11. The aforementioned limitations of claim 18 are substantially similar to the
aforementioned limitations of claim 9. Thus, for at least the reasons described in paragraph 10 of 
this declaration, the disclosure of the '599 patent provides detailed information and description 
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 18 (as amended) of the 10/757,866 patent application, and that I further believe would
have enabled a person of ordinary skill in the art to make and use the claimed invention without 
undue experimentation. 

12. The disclosure of the '599 patent provides detailed information and description 
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 33 (as amended) of the 10/757,866 patent application, and that I further believe would 
have enabled a person of ordinary skill in the art to make and use the claimed invention without 
undue experimentation. For example, on at least pages 19-20 (describing general registers), 21 
(generally describing store instructions), 123-125, and 128-130 of the Zeus manual (describing 
details of Store and Store Immediate instructions, including Store Multiplex and Store Multiplex 
Immediate forms) there are detailed descriptions of the aforementioned claim elements. 

13. The disclosure of the '599 patent provides detailed information and description
that I believe indicates that the inventors were in possession of the aforementioned limitations of 
claim 34 (as amended) of the 10/757,866 patent application, and that I further believe would 
have enabled a person of ordinary skill in the art to make and use the claimed invention without 
undue experimentation. For example, on at least pages 14-16 (describing floating-point data 
formats), 19-20 (describing general registers), 23-24 and 55 (describing floating-point arithmetic 
hardware), and 258-260 of the Zeus manual (describing details of Ensemble Floating-point 
instructions, such as various Ensemble Multiply, Add, and Divide forms) there are detailed 
descriptions of the aforementioned claim elements. 

14. The aforementioned limitations of claim 40 are substantially similar to the 
aforementioned limitations of claim 33. Thus, for at least the reasons described in paragraph 12 
of this declaration, the disclosure of the '599 patent provides detailed information and 
description that I believe indicates that the inventors were in possession of the aforementioned 
limitations of claim 40 (as amended) of the 10/757,866 patent application, and that I further 
believe would have enabled a person of ordinary skill in the art to make and use the claimed 
invention without undue experimentation. 

15. The aforementioned limitations of claim 41 are substantially similar to the 
aforementioned limitations of claim 34. Thus, for at least the reasons described in paragraph 13 
of this declaration, the disclosure of the '599 patent provides detailed information and 
description that I believe indicates that the inventors were in possession of the aforementioned 
limitations of claim 41 (as amended) of the 10/757,866 patent application, and that I further 
believe would have enabled a person of ordinary skill in the art to make and use the claimed 
invention without undue experimentation.

16. A detailed explanation of the basis for my opinions is set forth in the remainder of 
this declaration. 

Detailed Basis for My Opinions

Analysis of the disclosures of the '840 and the '599 patents: 

17. For brevity, the following analysis focuses on and provides details relating to the 
'840 patent, while reciting summary information pointing out where similar descriptive 
information is provided in the '599 patent. 

18. As discussed in paragraph 20 of the initial Van Dyke declaration, the '840 patent 
describes structure of a general purpose, programmable media processor (including, for example, 
a register file and an execution unit), and the '840 patent recites that an instruction set for the 
general purpose media processor is described by the Microfiche Appendix (referred to herein as 
the Terpsichore manual). In addition to elements discussed in paragraph 20 of the initial Van 
Dyke declaration, the '840 patent also describes that a unified stream of media data is processed 
by storage into the register file 110, and multi-precision arithmetic operations are performed on 
the media data. The operations include Boolean, integer, and floating-point mathematical 
operations (see '840, column 5, lines 47-53). Floating-point addition, subtraction, multiplication, 
division, and square root are supported in hardware ('840, column 15, lines 57-59). Similarly, 
the '599 patent describes structure of a general purpose, programmable processor for broadband 
applications, and the '599 patent includes and refers to a Microfiche Appendix (referred to herein 
as the Zeus manual) that describes, for example, an instruction set for the general purpose 
processor. 

19. The Terpsichore manual describes all of the aforementioned elements of claims 9, 
18, 33, 34, 40, and 41, on at least pages 19-21, 24-26, 29, 47-48, 129-131, and 150-157 (attached 
as Exhibit A). The Zeus manual describes all of the elements of claims 9, 18, 33, 34, 40, and 41, 
on at least pages 14-16, 19-21, 23-24, 55, 123-125, 128-130, and 258-260 (attached as Exhibit 
B). 

Claims 9 and 18 

20. The Terpsichore manual describes all of the aforementioned elements of claim 9 
(and the substantially similar aforementioned elements of claim 18), on at least pages 19-21, 24- 
25, 29, 47-48, and 129-131, describing Group Floating-point instructions such as various forms 
of Group Floating-point Multiply instructions. Paragraphs 21-27 of this declaration discuss 
selected portions of those pages. Paragraphs 28-32 of this declaration discuss how the elements 
of claim 9 (and the substantially similar aforementioned elements of claim 18) are described by 
those pages. The Zeus manual describes all of the elements of claim 9 (and the substantially 
similar aforementioned elements of claim 18), on at least pages 14-16, 19-20, 23-24, 55, and 
258-260. 

21. The '840 patent describes various floating-point data sizes such as 16, 32, 64, and
128 bits (see '840, column 15, lines 62-65, and Fig. 9b, reproduced below). The Terpsichore 
manual, for example on pages 19-21, describes the various floating-point data sizes as designed 
to satisfy ANSI/IEEE standard 754-1985. Similarly, the Zeus manual, for example on pages 14- 
16, provides similar information. 

half
   15     14..10       9..0
| sign | exponent | significand |
    1       5           10

single
   31     30..23      22..0
| sign | exponent | significand |
    1       8           23

double
   63     62..52      51..0
| sign | exponent | significand |
    1      11           52

quad
  127    126..112     111..0
| sign | exponent | significand |
    1      15          112
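The field layout above can be illustrated with a minimal Python sketch. It is not taken from the '840 patent or the Terpsichore manual; the names `FORMATS` and `split_fields` are hypothetical, and the field widths are those of the half, single, double, and quad formats listed above.

```python
# Illustrative only: (sign, exponent, significand) field widths per format width.
FORMATS = {16: (1, 5, 10), 32: (1, 8, 23), 64: (1, 11, 52), 128: (1, 15, 112)}

def split_fields(bits, width):
    """Split a raw bit pattern into its (sign, exponent, significand) fields."""
    _, exp_bits, sig_bits = FORMATS[width]
    significand = bits & ((1 << sig_bits) - 1)
    exponent = (bits >> sig_bits) & ((1 << exp_bits) - 1)
    sign = (bits >> (width - 1)) & 1
    return sign, exponent, significand

# 0x3C00 encodes 1.0 in the half format: sign 0, biased exponent 15, significand 0.
print(split_fields(0x3C00, 16))  # (0, 15, 0)
```

The same function applies to each of the four widths because only the field boundaries differ between the formats.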

22. The '840 patent provides description relating to floating-point hardware
capabilities, such as in the Terpsichore manual on page 29, "operations supported in hardware 
are floating-point add, subtract, multiply, divide, and square root", and further on page 47, 
"partitioning favored for the initial implementation places all instructions that involving shifting 
and shuffling in one execution unit, and all instructions that involve multiplication, including 
fixed-point and floating-point multiply and add in another unit". Similarly, the Zeus manual, for 
example on pages 23-24 and 55, has similar descriptive information. 

23. The Terpsichore manual describes several variations of Group Floating-point
Multiply instructions, such as GF.MUL.16, GF.MUL.32, and GF.MUL.64 (among others), as
described on pages 129-131 and reproduced below (excerpted and annotations added).
Similarly, the Zeus manual, for example on pages 258-260, provides similar information. 

Group Floating-point

These operations take two values from registers, perform floating-point arithmetic
on groups of bits in the operands, and place the concatenated results in a register.

GF.MUL.16      Group floating-point multiply half
GF.MUL.16.C    Group floating-point multiply half ceiling
GF.MUL.16.F    Group floating-point multiply half floor
GF.MUL.16.N    Group floating-point multiply half nearest
GF.MUL.16.T    Group floating-point multiply half truncate
GF.MUL.16.X    Group floating-point multiply half exact
GF.MUL.32      Group floating-point multiply single
GF.MUL.32.C    Group floating-point multiply single ceiling
GF.MUL.32.F    Group floating-point multiply single floor
GF.MUL.32.N    Group floating-point multiply single nearest
GF.MUL.32.T    Group floating-point multiply single truncate
GF.MUL.32.X    Group floating-point multiply single exact
GF.MUL.64      Group floating-point multiply double



24. The Terpsichore manual, on page 130, describes an instruction format for the 
Group Floating-point instructions (including Multiply forms as well as Add and Divide forms), 
reproduced below. Similarly, the Zeus manual, for example on page 260, provides similar 
information. 



Format

GF.op.prec.round rc=ra,rb

 31     24 23   18 17   12 11    6 5         0
| GF.prec |  ra   |  rb   |  rc   | op.round |
     8        6       6       6        6

The operands of the Group Floating-point instructions (such as Multiply forms) include 'ra', 
'rb', and 'rc'. As described in more detail in paragraphs 26(b)-26(c)ii of this declaration, 
contents of registers specified by the 'ra' and 'rb' operands are interpreted as respective
collections of partitioned floating-point operands. The partitioned floating-point operands 
relating to 'ra' and 'rb' are pairwise multiplied together, and the results are concatenated and 
then stored in a register specified by the 'rc' operand. 
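As an illustration of the field boundaries in the format above, the following is a minimal Python sketch that splits a 32-bit instruction word into its five fields. The function name `decode_gf_fields` and the example opcode values are hypothetical; the Terpsichore manual excerpt reproduced here gives only the bit positions, not the numeric encodings.

```python
def decode_gf_fields(word):
    """Split a 32-bit instruction word into the five fields of the format."""
    return {
        "gf_prec": (word >> 24) & 0xFF,   # bits 31..24 (8 bits)
        "ra": (word >> 18) & 0x3F,        # bits 23..18 (6 bits)
        "rb": (word >> 12) & 0x3F,        # bits 17..12 (6 bits)
        "rc": (word >> 6) & 0x3F,         # bits 11..6  (6 bits)
        "op_round": word & 0x3F,          # bits 5..0   (6 bits)
    }

# Hypothetical encoding: the values 0xAB and 5 are placeholders, not real opcodes.
word = (0xAB << 24) | (0 << 18) | (62 << 12) | (2 << 6) | 5
fields = decode_gf_fields(word)
print(fields["ra"], fields["rb"], fields["rc"])  # 0 62 2
```

Each 6-bit register field can name any of the 64 general registers, which is consistent with the register description discussed in the next paragraph.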

25. The Terpsichore manual, on pages 24-25, describes registers referenced as
operands of some instructions (such as Group Floating-point instructions, including Multiply
forms as well as Add and Divide forms), as reproduced below (with annotations illustrating
examples of an 'ra' operand of '0' and an 'rb' operand of '62'). Similarly, the Zeus manual, for
example on pages 19-20, provides similar information.

General Registers 

Terpsichore user state includes 64 general registers. All are identical; there is no 
dedicated zero valued register, and there are no dedicated floating-point registers. 

[Figure: diagram of the general registers, labeled 63 ... 0; annotations not legible in this reproduction]

The foregoing registers are included in register file 110 of Fig. 7 of the '840 patent.

26. The Terpsichore manual, on pages 130-131, describes a definition of various
Group Floating-point instructions, including several Multiply forms. The definition is 
reproduced below, with annotations highlighting several elements that are discussed in the 
following sub-paragraphs concerning highlights of what one of ordinary skill in the art would 
understand from the description of the Terpsichore manual. Similarly, the Zeus manual, for 
example on page 260, provides similar information. 

Definition

def GroupFloatingPoint(op, prec, round, ra, rb, rc) as
[2]     a <- RegRead(ra, 128)
        b <- RegRead(rb, 128)
[3]     for i <- 0 to 128-prec by prec
[4]         (lines not legible in this reproduction; per paragraph 26(c)ii, ai and bi
            are determined from a_{i+prec-1..i} and b_{i+prec-1..i})
            if round≠NONE then
                if isSignallingNaN(ai) | isSignallingNaN(bi)
                    raise FloatingPointException
                endif
                case op of
                    GF.DIV:
                        if bi=0 then
                            raise FloatingPointArithmetic
                        endif
                    others:
                endcase
            endif
            case op of
                GF.ADD:
                    ci <- ai+bi
[1]             GF.MUL:
[5]                 ci <- ai*bi
                GF.DIV:
                    ci <- ai/bi
            endcase
            case op of
                GF.ADD, GF.MUL, GF.DIV:
[6]                 c_{i+prec-1..i} <- PackF(prec, ci)
            endcase
        endfor
    endcase
    case round of
        X:
        N:
        T:
        F:
        C:
        NONE:
    endcase
    if rc0 then
        raise ReservedInstruction
    endif
[7] RegWrite(rc, 128, c)
    endcase
enddef

(a) The 'op' field of the instruction is decoded to distinguish a Multiply form (MUL)
from an Add (ADD) or Divide (DIV) form, for example, as highlighted by
annotation [1].

(b) Source operands are read from pairs of registers, as specified by the 'ra' and 'rb'
operands, into variables 'a' and 'b', respectively (see annotation [2]), such as
including reading REG[0] into the least-significant 64 bits of 'a' and REG[1] into
the most-significant 64 bits of 'a', when 'ra' is 0.

(c) A 'for' construct (see annotation [3]) specifies a number of evaluations of
elements of the construct according to a 'prec' operand that specifies a
(floating-point) precision to interpret the operands.

i. For example, if the 'prec' operand is 16, then the 'for' construct is
evaluated with eight values for variable 'i' (0, 16, 32, 48, 64, 80, 96, and
112), respectively. For another example, if the 'prec' operand is 64, then
the construct is evaluated with two values for 'i' (0 and 64), respectively.

ii. Each evaluation begins by determining a partitioned floating-point value
from each of the variables 'a' and 'b' (see annotation [4]) in accordance
with the 'prec' operand. For example, if the 'prec' operand is 16, then for
the evaluation where 'i' is 0, a first partitioned floating-point value is
determined from the least-significant 16 bits of 'a', or 'a_{15..0}', as
identified by the expression 'a_{i+prec-1..i}' in the definition. Further in the
evaluation where 'i' is 0, a second floating-point value is determined from
the least-significant 16 bits of 'b'. Continuing with the example, for the
evaluation where 'i' is 16, partitioned floating-point values are determined
from the next-most-significant 16 bits of 'a' and 'b', such that bits 31 to 16
determine the floating-point values. Further continuing with the example,
for the evaluation where 'i' is 112, partitioned floating-point values are
determined from the most-significant 16 bits of 'a' and 'b'. For another
example of construct evaluations, if the 'prec' operand is 64, then the 'for'
construct is evaluated with two values for 'i' (0 and 64). Partitioned
floating-point values are determined in the evaluation where 'i' is 0 as the
least-significant 64 bits of 'a' and 'b', and in the evaluation where 'i' is 64
as the most-significant 64 bits.

iii. The 'for' construct processing continues by decoding a rounding mode
and processing accordingly, and then decoding the 'op' field of the
instruction (see previously discussed annotation [1]).

iv. In the case of a Multiply form, the 'for' construct processing continues by
multiplying the partitioned floating-point values from 'a' and 'b' by each
other (see annotation [5]). The multiplying is in accordance with
floating-point multiplying.

v. The 'for' construct processing completes by writing the result of the
multiplies into appropriate bits of destination variable 'c' (see annotation
[6]) in accordance with the 'prec' operand. The appropriate bit locations
of 'c' are identical to the bit locations of 'a' and 'b' that the partitioned
floating-point values were determined from. For example, if the 'prec'
operand is 16, then for the evaluation where 'i' is 0, the least-significant
16 bits (i.e. bits 15 to zero) of 'c' are written, and for the evaluation where
'i' is 16, the next-most-significant 16 bits (i.e. bits 31 to 16) of 'c' are
written. For the evaluation where 'i' is 112, the most-significant bits (i.e.
bits 127 to 112) of 'c' are written.

vi. Note that each evaluation of the 'for' construct is independent of the other
evaluations, serving to operate on different unique and non-overlapping 
partitioned fields of the operands. Thus each evaluation is performable in 
parallel with the other evaluations, sequentially with the other evaluations, 
or any combination thereof. 

(d) After completion of the 'for' construct, processing completes by writing 'c' into a 
pair of registers, as specified by the 'rc' operand (see annotation [7]). 

One of ordinary skill in the art would readily understand that the computation of the floating- 
point multiplies would occur in ALU 102 of Fig. 7. 
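The processing described in sub-paragraphs (b) through (d) can be sketched in Python as follows. This is an illustrative model only, not the manual's definition: all function names are hypothetical, Python's `struct` module stands in for the half, single, and double encodings, and rounding modes, exceptions, and the reserved-instruction check are not modeled.

```python
import struct

FLOAT_CODE = {16: "<e", 32: "<f", 64: "<d"}  # half, single, double
UINT_CODE = {16: "<H", 32: "<I", 64: "<Q"}

def bits_to_float(bits, prec):
    return struct.unpack(FLOAT_CODE[prec], struct.pack(UINT_CODE[prec], bits))[0]

def float_to_bits(value, prec):
    return struct.unpack(UINT_CODE[prec], struct.pack(FLOAT_CODE[prec], value))[0]

def lane_offsets(prec):
    """Values taken by 'i' in 'for i <- 0 to 128-prec by prec'."""
    return list(range(0, 128 - prec + 1, prec))

def pack_lanes(values, prec):
    """Concatenate floating-point values into one 128-bit integer."""
    word = 0
    for i, v in zip(lane_offsets(prec), values):
        word |= float_to_bits(v, prec) << i
    return word

def unpack_lanes(word, prec):
    mask = (1 << prec) - 1
    return [bits_to_float((word >> i) & mask, prec) for i in lane_offsets(prec)]

def reg_read(regs, r):
    """Read a register pair: regs[r] is the low 64 bits, regs[r+1] the high."""
    return regs[r] | (regs[r + 1] << 64)

def reg_write(regs, r, value):
    regs[r] = value & (2**64 - 1)
    regs[r + 1] = (value >> 64) & (2**64 - 1)

def group_fmul(regs, ra, rb, rc, prec):
    a = reg_read(regs, ra)
    b = reg_read(regs, rb)
    c = 0
    mask = (1 << prec) - 1
    for i in lane_offsets(prec):
        ai = bits_to_float((a >> i) & mask, prec)   # partitioned operand from 'a'
        bi = bits_to_float((b >> i) & mask, prec)   # partitioned operand from 'b'
        ci = ai * bi                                # per-partition multiply
        c |= float_to_bits(ci, prec) << i           # same bit positions in 'c'
    reg_write(regs, rc, c)

regs = [0] * 64
reg_write(regs, 0, pack_lanes([1.0, 2.0, 3.0, 4.0], 32))
reg_write(regs, 62, pack_lanes([0.5, 0.5, 2.0, -1.0], 32))
group_fmul(regs, 0, 62, 2, 32)
print(unpack_lanes(reg_read(regs, 2), 32))  # [0.5, 1.0, 6.0, -4.0]
```

Because each loop iteration reads and writes only its own partition, the iterations could equally be evaluated in parallel, as noted in sub-paragraph (c)vi.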

27. Thus the Group Floating-point Multiply instructions are described in the 
Terpsichore manual (and also the Zeus manual) as interpreting contents of two source registers 
as respective pluralities of floating-point operands that are multiplied together, producing a 
plurality of products as results. The results are concatenated together and stored in a register 
specified by a third operand. 

28. At least the Terpsichore manual pages 19-21, 24-25, 29, 47-48, and 129-131 
describe all elements of claim 9 and substantially similar claim 18 (as amended), as described in 
more detail in paragraphs 29-32 of this declaration. Similarly, at least the Zeus manual pages 14- 
16, 19-20, 23-24, 55, and 258-260 describe all elements of claim 9 and substantially similar 
claim 18 (as amended). 

29. The element (of claim 9) decoding a second single instruction specifying a 
register containing a first plurality of floating-point operands and another register containing a 
second plurality of floating-point operands is described by the Terpsichore manual, for example 
as annotated and discussed by paragraphs 21-27 of this declaration. Each of the Group Floating- 
point Multiply instructions is an exemplary single instruction that specifies registers containing 
respective pluralities of floating-point operands, via, for example, the 'ra' and 'rb' operands.
See paragraphs 26(b) and 26(c)ii of this declaration for additional detailed discussion. As 
discussed in paragraph 22 of this declaration, an execution unit is clearly disclosed that is 
operable as claimed in this element of claim 9, as well as the other elements of claim 9 discussed 
in the following paragraphs. 

30. The element multiplying the first plurality of floating-point operands by the 
second plurality of floating-point operands to produce a plurality of products is described by the 
Terpsichore manual, for example as annotated and discussed in paragraphs 21-27 of this 
declaration. Each of the Group Floating-point Multiply instructions is operable to perform a 
floating-point multiply on operands from registers specified by 'ra' and 'rb'. See paragraph
26(c)iv of this declaration for additional detailed discussion. 

31. The element providing the plurality of products to partitioned fields of a result
register as a catenated result is described by the Terpsichore manual, for example as annotated 
and discussed in paragraphs 21-27 of this declaration. Each of the Group Floating-point 
Multiply instructions is operable to produce products into fields of bits of result registers via, for 
example, the 'rc' operand. See paragraphs 26(c)v and 26(d) of this declaration for additional
detailed discussion. 

32. Thus every element of claim 9 and substantially similar claim 18 (as amended) 
are described at least by the Terpsichore manual on pages 19-21, 24-25, 29, 47-48, and 129-131. 
In addition, every element of claim 9 and substantially similar claim 18 are also described at 
least by the Zeus manual on pages 14-16 (describing floating-point data formats), pages 19-20 
(describing general registers), pages 23-24 and 55 (describing floating-point arithmetic 
hardware), and pages 258-260 (a definition for Ensemble Floating-point instructions, including 
Multiply forms). 



Claims 33 and 40 

33. The Terpsichore manual describes all of the aforementioned elements of claim 33 
(and the substantially similar aforementioned elements of claim 40), on at least pages 24-26 and 
150-157, describing Store Immediate instructions such as various forms of Store Multiplex 
Immediate instructions. The Zeus manual describes all of the elements of claim 33 (and the 
substantially similar aforementioned elements of claim 40), on at least pages 19-21, 123-125, and 
128-130. 

34. Claim 33 is dependent upon claim 28, and context associated with the element (of 
claim 33) the first and second operands is from claim 28, reproduced below (as amended): 

{claim 28} A method for processing data in a programmable processor, the
method comprising: 

decoding a single instruction for performing a bitwise insert 
operation on data in at least one register in a register file within the programmable 
processor, the bitwise insert operation operating on a first operand and a second 
operand stored in the at least one register in the register file, wherein each bit in 
the second operand is individually selectable as either having a first 
predetermined value or a second predetermined value; and 

for each bit in the first operand, the bitwise insert operation 
inserting the bit into a corresponding bit position in a destination value if a 
corresponding bit in the second operand has the first predetermined value. 

35. As is described in more detail in paragraphs 36-39 of this declaration, the 
Terpsichore manual, on at least pages 24-26 and 150-157, describes all of the limitations of 
claim 28, in addition to claim 33, with respect to several variations of Store and Store Immediate 
instructions, including Store Multiplex and Store Multiplex Immediate forms. The initial Van 
Dyke declaration, in paragraphs 22-28, discusses selected portions of those pages. Similarly, at 
least the Zeus manual pages 19-21, 123-125, and 128-130 describe all elements of claims 28 and 
33. 

36. The element (of claim 28) "decoding a single instruction for performing a bitwise
insert operation on data in at least one register in a register file within the programmable
processor, the bitwise insert operation operating on a first operand and a second operand stored
in the at least one register in the register file, wherein each bit in the second operand is
individually selectable as either having a first predetermined value or a second predetermined
value" is described by the Terpsichore manual, as annotated and discussed in paragraphs 22-28 of
the initial Van Dyke declaration. Each of the Store Multiplex Immediate instructions is a single
instruction, as evidenced at least by the dedicated operation codes "S.MUX.64.B.A.I" and
"S.MUX.64.L.A.I". Each of the Store Multiplex Immediate instructions is for performing a
bitwise insert operation, since the combination of bit-wise logical-AND and bit-wise logical-OR
described for computing the store value results in insertion of a bit in place of another bit, e.g.
"insertion" (see paragraph 27 of the initial Van Dyke declaration). The claimed at least one
register corresponds to the register pair identified by the 'rb' operand (see paragraph 26 of the
initial Van Dyke declaration). The claimed first and second operands correspond respectively to
the odd- and even-numbered registers of the register pair. There are no stated restrictions on
values for the second operand, and therefore as claimed, each bit in the second operand is
individually selectable as having either a first or a second predetermined value.

37. The element (of claim 28) "for each bit in the first operand, the bitwise insert
operation inserting the bit into a corresponding bit position in a destination value if a
corresponding bit in the second operand has the first predetermined value" is described by the
Terpsichore manual, as annotated and discussed in paragraphs 22-28 of the initial Van Dyke
declaration. As discussed in paragraph 27 of the initial Van Dyke declaration, a store value is
determined by bit-wise multiplexing (e.g. "inserting") between a first data input and a second
data input, based on a control input. The second data input is from memory. The first data input
is the upper 64 bits of a value identified in the Terpsichore manual as 'm', and the control input
is the lower 64 bits of 'm'. As discussed in paragraph 26 of the initial Van Dyke declaration, the
upper 64 bits of 'm' are obtained from the odd-numbered register identified by 'rb',
corresponding to the claimed first operand. Further, the lower 64 bits of 'm' are obtained from
the even-numbered register identified by 'rb', corresponding to the claimed second operand.

38. Therefore, according to the discussion in paragraphs 36-37 of this declaration, for the
element (of claim 33) "wherein each of the first and second operands has a width of 64 bits", the
first operand corresponds to the upper 64 bits of 'm' that are obtained from the odd-numbered
register identified by 'rb'. The second operand corresponds to the lower 64 bits of 'm' that are
obtained from the even-numbered register identified by 'rb'. Thus both the first and the second
operands are described as having a width of 64 bits. 
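
For illustration only, the bit-wise multiplexing discussed in paragraphs 36-38 can be sketched in Python as follows. This sketch is not taken from the Terpsichore or Zeus manuals. The function names are hypothetical, and the polarity (a control bit of 1 selects the register data bit, a control bit of 0 keeps the memory bit) is an assumption made for the example.

```python
MASK64 = (1 << 64) - 1  # all ones in a 64-bit field


def split_register_pair(m):
    """Split the 128-bit value 'm' into (first operand, second operand)."""
    data = (m >> 64) & MASK64  # upper 64 bits: odd-numbered register of the pair
    control = m & MASK64       # lower 64 bits: even-numbered register of the pair
    return data, control


def store_multiplex(m, memory_octlet):
    """Return the 64-bit store value formed by bit-wise multiplexing.

    Assumption: a control bit of 1 inserts the data bit; a control bit
    of 0 leaves the corresponding memory bit unchanged.
    """
    data, control = split_register_pair(m)
    inserted = data & control                    # bit-wise logical-AND
    preserved = memory_octlet & ~control & MASK64
    return inserted | preserved                  # bit-wise logical-OR


m = (0xFFFF0000FFFF0000 << 64) | 0x00000000FFFFFFFF
result = store_multiplex(m, 0x123456789ABCDEF0)
```

In this example only the low 32 bits of the memory octlet are replaced, because only the low 32 control bits are set.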

39. Thus every element of claim 33 is described at least by the Terpsichore manual on 
pages 24-26 and 150-157. In addition, every element of claim 33 is also described at least by the 
Zeus manual on pages 19-21, 123-125, and 128-130. 

40. Claim 40 is dependent upon claim 35, and context associated with the element (of 
claim 40) the first and second operands is from claim 35, reproduced below (as amended): 

{ claim 35 } A computer-readable storage medium having stored therein a 
plurality of instructions that cause a programmable processor to perform 
operations on data in the programmable processor, the plurality of instructions 
comprising:

an instruction that causes the processor to perform a bitwise insert
operation on data in at least one register in a register file within the programmable 
processor, the bitwise insert operation operating on a first operand and a second 
operand stored in the at least one register in the register file, wherein each bit in 
the second operand is individually selectable as either having a first 
predetermined value or a second predetermined value; and 

wherein for each bit in the first operand, the bitwise insert 
operation inserts the bit into a corresponding bit position in a destination value if a 
corresponding bit in the second operand has the first predetermined value. 

41. Claim 35 (a computer-readable storage medium claim) is substantially similar to
claim 28 (a method claim), and as previously discussed, claim 40 is substantially similar to claim 
33. Therefore paragraphs 33-39 in this declaration concerning claim 33 and parent claim 28 are 
also applicable to claim 40 and parent claim 35. Thus every element of claim 40 is described at 
least by the Terpsichore manual on pages 24-26 and 150-157. In addition, every element of 
claim 40 is also described at least by the Zeus manual on pages 19-21, 123-125, and 128-130. 

Claims 34 and 41 

42. The Terpsichore manual describes all of the aforementioned elements of claim 34 
(and the substantially similar aforementioned elements of claim 41), on at least pages 19-21,
24-25, 29, 47-48, and 129-131, describing Group Floating-point instructions such as various forms
of Group Floating-point Add, Divide, and Multiply instructions. Paragraphs 43-50 of this 
declaration discuss selected portions of those pages. Paragraphs 51-52 of this declaration 
describe how the elements of claim 34 (and the substantially similar aforementioned elements of 
claim 41) are described by those pages. The Zeus manual describes all of the elements of claim 
34 (and the substantially similar aforementioned elements of claim 41), on at least pages 14-16, 
19-20, 23-24, 55, and 258-260. 

43. As discussed in more detail in paragraph 21 of this declaration, the '840 and the 
'599 patents describe various floating-point data sizes. 

44. As discussed in more detail in paragraph 22 of this declaration, the '840 and the 
'599 patents describe various floating-point hardware capabilities.

45. The Terpsichore manual describes several types and variations of Group Floating-point
instructions, such as Group Floating-point Add (e.g. GF.ADD.64), Divide (e.g.
GF.DIV.32), and Multiply (e.g. GF.MUL.16) instructions, as described on pages 129-131, and
reproduced below (excerpted and annotations added). Similarly, the Zeus manual, for example
on pages 258-260, provides similar information.

Group Floating-point

These operations take two values from registers, perform floating-point arithmetic
on groups of bits in the operands, and place the concatenated results in a register.

Operation codes

GF.ADD.64      Group floating-point add double
GF.ADD.64.C    Group floating-point add double ceiling
GF.ADD.64.F    Group floating-point add double floor
GF.ADD.64.N    Group floating-point add double nearest
GF.ADD.64.T    Group floating-point add double truncate
GF.ADD.64.X    Group floating-point add double exact
GF.DIV.16      Group floating-point divide half
GF.DIV.16.C    Group floating-point divide half ceiling
GF.DIV.16.F    Group floating-point divide half floor
GF.DIV.16.N    Group floating-point divide half nearest
GF.DIV.16.T    Group floating-point divide half truncate
GF.DIV.16.X    Group floating-point divide half exact
GF.DIV.32      Group floating-point divide single
GF.DIV.32.C    Group floating-point divide single ceiling
GF.DIV.32.F    Group floating-point divide single floor
GF.DIV.32.N    Group floating-point divide single nearest
GF.DIV.32.T    Group floating-point divide single truncate
GF.DIV.32.X    Group floating-point divide single exact
GF.DIV.64      Group floating-point divide double
GF.DIV.64.C    Group floating-point divide double ceiling
GF.DIV.64.F    Group floating-point divide double floor
GF.DIV.64.N    Group floating-point divide double nearest
GF.DIV.64.T    Group floating-point divide double truncate
GF.DIV.64.X    Group floating-point divide double exact
GF.MUL.16      Group floating-point multiply half
GF.MUL.16.C    Group floating-point multiply half ceiling
GF.MUL.16.F    Group floating-point multiply half floor

46. As discussed in paragraph 24 of this declaration, the '840 and the '599 patents 
describe an instruction format for the Group Floating-point instructions (including Add, Divide, 
and Multiply forms).

47. As discussed in paragraph 25 of this declaration, the '840 and the '599 patents
describe registers referenced as operands of some instructions (such as Group Floating-point 
Add, Divide, and Multiply instructions). 

48. As discussed in paragraph 26 of this declaration, the '840 patent and the '599 
patent describe definitions of various Group Floating-point instructions, including several 
Multiply forms. In addition, Add and Divide forms of Group Floating-point instructions are 
described by the '840 patent and the '599 patent, as evidenced by the following excerpt from 
page 131 of the Terpsichore manual: 

case op of
    GF.ADD:
        ci <- ai+bi
    GF.MUL:
        ci <- ai*bi
    GF.DIV:
        ci <- ai/bi
endcase

Similarly, the Zeus manual, for example on page 260, provides similar information. 

49. The discussion of Multiply forms of Group Floating-point instructions in sub- 
paragraphs 26(a)-26(d) of this declaration is generally applicable to Add and Divide forms, as 
well. Rather than multiplying the partitioned floating-point values (as discussed in sub- 
paragraph 26(c)iv of this declaration), the floating-point values are added (Add form) or divided 
(Divide form). 

50. Thus the Group Floating-point instructions of Add, Divide, and Multiply forms 
are described in the Terpsichore manual (and also the Zeus manual) as interpreting contents of 
two source registers as respective pluralities of floating-point operands that are added, divided, 
or multiplied together, respectively, producing a plurality of floating-point results that are 
concatenated together and stored in a register specified by a third operand. 
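
For illustration only, the behavior summarized in paragraph 50 can be modeled with the following minimal Python sketch. It is not the manuals' definition. The names `unpack_fields`, `pack_fields`, and `group_float` are hypothetical, and numbering field 0 from the least significant bits is an assumption. Rounding options and floating-point exception handling are not modeled.

```python
import struct

# struct format codes for IEEE 754 half, single, and double precision
_FMT = {16: "e", 32: "f", 64: "d"}

_OPS = {
    "ADD": lambda x, y: x + y,
    "MUL": lambda x, y: x * y,
    "DIV": lambda x, y: x / y,  # Python raises on division by zero (not modeled)
}


def unpack_fields(reg, prec, width=128):
    """Interpret a width-bit register value as width/prec floating-point fields."""
    raw = reg.to_bytes(width // 8, "little")
    return list(struct.unpack("<%d%s" % (width // prec, _FMT[prec]), raw))


def pack_fields(values, prec, width=128):
    """Concatenate floating-point values into a single width-bit register value."""
    raw = struct.pack("<%d%s" % (width // prec, _FMT[prec]), *values)
    return int.from_bytes(raw, "little")


def group_float(op, prec, ra, rb):
    """Apply op to each pair of fields of ra and rb; return the catenated result."""
    a = unpack_fields(ra, prec)
    b = unpack_fields(rb, prec)
    c = [_OPS[op](ai, bi) for ai, bi in zip(a, b)]
    return pack_fields(c, prec)


ra = pack_fields([1.0, 2.0, 3.0, 4.0], 32)
rb = pack_fields([0.5, 0.5, 2.0, 4.0], 32)
products = unpack_fields(group_float("MUL", 32, ra, rb), 32)
```

Here a single call produces four single-precision products from two 128-bit source values, and the four results occupy partitioned fields of one 128-bit result value.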

51. The element (of claim 34) "a step of executing a plurality of different group
floating-point arithmetic operations that arithmetically operate on multiple floating-point
operands stored in partitioned fields of registers in the register file to produce a catenated result
that is returned to a register in the register file, wherein the catenated result comprises a plurality
of individual floating-point results" is described by the Terpsichore manual, for example as
annotated and discussed in paragraphs 43-50 of this declaration. The Add, Divide, and Multiply
forms of Group Floating-point instructions are exemplary instructions that embody arithmetic
operations that operate on multiple floating-point operands, for example as specified by the 'ra'
and 'rb' operands. Results of the arithmetic operations are returned, for example, to result
registers specified by the 'rc' operand. As discussed in paragraph 22 of this declaration, an
execution unit is clearly disclosed that is operable as claimed in claim 25.

52. Thus every element of claim 34 and substantially similar claim 41 (as amended)
is described at least by the Terpsichore manual on pages 19-21, 24-25, 29, 47-48, and 129-131.
In addition, every element of claim 34 and substantially similar claim 41 is also described at
least by the Zeus manual on pages 14-16 (describing floating-point data formats), pages 19-20
(describing general registers), pages 23-24 and 55 (describing floating-point arithmetic 
hardware), and pages 258-260 (a definition for Ensemble Floating-point instructions, including 
Add, Divide, and Multiply forms). 

Summary and Closing: 

53. The '840 patent, including the Terpsichore manual, provides sufficient 
information in sufficient detail describing the claimed invention (as amended) of the 10/757,866 
patent application, that one of ordinary skill in the art would reasonably conclude that the 
inventors had possession of the claimed invention at the time of filing the '840 patent. Further, 
the '840 patent, including the Terpsichore manual, provides sufficient information regarding the 
subject matter of the claimed invention (as amended) of the 10/757,866 patent application to 
enable one of ordinary skill in the pertinent art to make and use the claimed invention without 
undue experimentation. In addition, the '599 patent, including the Zeus manual, provides 
sufficient information in sufficient detail describing the claimed invention (as amended) of the 
10/757,866 patent application, that one of ordinary skill in the art would reasonably conclude 
that the inventors had possession of the claimed invention at the time of filing the '599 patent. 
Further, the '599 patent, including the Zeus manual, provides sufficient information regarding 
the subject matter of the claimed invention (as amended) of the 10/757,866 patent application to 
enable one of ordinary skill in the pertinent art to make and use the claimed invention without 
undue experimentation. 

54. Therefore, I believe that each of the '840 patent, including the Terpsichore
manual, and the '599 patent, including the Zeus manual, provides adequate written description
and enablement as required by 35 USC § 112 for the limitations of claims 9, 18, 33, 34, 40, and
41 (as amended) of the 10/757,866 patent application, as discussed in paragraph 3 of this 
declaration. 

55. I have had no communication with any of the inventors of the 10/757,866 patent 
application (Craig Hansen and John Moussouris) relating to any material in this declaration. 

56. I have been hired as a consultant in connection with procedures before the United 
States Patent and Trademark Office (USPTO) regarding patents and patent applications assigned 
to Microunity Systems Engineering, Inc., including the media processor patent application. I am 
being compensated for my services at the rate of $325/hour. Other than acting as a consultant in 
connection with procedures before the USPTO, I have no interest or connection with Microunity 
Systems Engineering, Inc. 

57. During my evaluation of the media processor patent application, I have been 
impressed by the thoroughness and overall high-quality of the Terpsichore and Zeus manuals. 
The manuals provide clear and unambiguous descriptions of media processing systems and are 
thorough and well-written. The manuals provide comprehensive descriptions of instructions in
complete architectural detail. The information in the manuals would have been readily 
understood and easily accessible to software engineers coding the media processing systems, and 
hardware engineers implementing microprocessors for use in the media processing systems, and 
that is exactly what architecture reference manuals should be. This is not surprising, since the
'840 patent and the '599 patent each include an architecture manual that is intended to enable
hardware engineers to do exactly that - design, build, and implement a media processor that 
would include circuitry for the claim limitations set forth in paragraph 3 of this declaration, as 
described in the Terpsichore and the Zeus architecture manuals.

58. I hereby declare that all statements made herein of my own knowledge are
true and that all statements made on information and belief are believed to be true; and further 
that these statements were made with the knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, under Section 1001 Title 18 of the United 
States Code and that such willful false statements may jeopardize the validity of the application 
or any patent issuing therefrom.

Fixed-point

Terpsichore provides load and store instructions to move data between memory 
and the registers, branch instructions to compare the contents of registers and to 
transfer control from one code address to another, and arithmetic operations to 
perform computation on the contents of registers, returning the result to registers. 

Load and Store

The load and store instructions move data between memory and the registers. 
When loading data from memory into a register, values are zero-extended or sign- 
extended to fill the register. When storing data from a register into memory, 
values are truncated on the left to fit the specified memory region. 
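
As a minimal illustrative sketch (not part of the manual), the extension and truncation rules above can be written in Python as follows. The function names are hypothetical, and the default register width of 64 bits is an assumption of the example.

```python
def load_extend(value, size_bits, signed, reg_bits=64):
    """Zero- or sign-extend a size_bits-wide memory value to fill a register."""
    value &= (1 << size_bits) - 1
    if signed and (value >> (size_bits - 1)) & 1:
        value -= 1 << size_bits          # negative: replicate the sign bit upward
    return value & ((1 << reg_bits) - 1)


def store_truncate(reg_value, size_bits):
    """Truncate on the left: keep only the low-order size_bits bits."""
    return reg_value & ((1 << size_bits) - 1)


signed_byte = load_extend(0x80, 8, signed=True)
unsigned_byte = load_extend(0x80, 8, signed=False)
stored = store_truncate(0x123456789ABCDEF0, 16)
```

The same byte value 0x80 fills the register with leading ones when loaded as signed and with leading zeros when loaded as unsigned.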

Load and store instructions that specify a memory region of more than one byte 
may use either little-endian or big-endian byte ordering: the size and ordering are 
explicitly specified in the instruction. Regions larger than one byte may be either
aligned to addresses that are an even multiple of the size of the region, or of 
unspecified alignment: alignment checking is also explicitly specified in the
instruction.



The load and store instructions are used for fixed-point data as well as floating- 
point and digital signal processing data; Terpsichore has a single bank of registers 
for all data types. 

Swap instructions provide multithread and multiprocessor synchronization, using
indivisible operations: add-and-swap, compare-and-swap, and multiplex-and-swap.
A store-multiplex operation provides the ability to indivisibly write to a
portion of an octlet. These instructions always operate on aligned octlet data, using 
either little-endian or big-endian byte ordering. 

Branch Conditionally 

The fixed-point compare-and-branch instructions provide all arithmetic tests for
equality and inequality of signed and unsigned fixed-point values. Tests are 
performed either between two operands contained in general registers, or on the 
bitwise and of two operands. Depending on the result of the compare, either a 
branch is taken, or not taken. A taken branch causes an immediate transfer of the 
program counter to the target of the branch, specified by a 12-bit signed offset 
from the location of the branch instruction. A non-taken branch causes no 
transfer; execution continues with the following instruction. 
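
The taken and not-taken cases above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the manual's definition. The excerpt does not state the unit of the 12-bit offset or the instruction size, so `scale` and `instr_bytes` are hypothetical parameters chosen only for the example.

```python
def sign_extend(field, bits=12):
    """Interpret the low `bits` bits of field as a two's-complement value."""
    field &= (1 << bits) - 1
    if field & (1 << (bits - 1)):
        return field - (1 << bits)
    return field


def next_pc(branch_pc, offset_field, taken, scale=4, instr_bytes=4):
    """Return the next program counter for a conditional branch.

    Assumptions: the offset counts units of `scale` bytes and each
    instruction occupies `instr_bytes` bytes.
    """
    if taken:
        return branch_pc + sign_extend(offset_field) * scale
    return branch_pc + instr_bytes


backward = next_pc(0x1000, 0xFFE, taken=True)
fallthrough = next_pc(0x1000, 0xFFE, taken=False)
```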



Other branch instructions provide for unconditional transfer of control to 
addresses too distant to be reached by a 12-bit offset, and to transfer to a target 
while placing the location following the branch into a register. The branch through 
gateway instruction provides a secure means to access code at a higher privilege 
level, in a form similar to a normal procedure call.

Arithmetic Operations

The operations supported in hardware are floating-point add, subtract, multiply, 
divide, and square root. Other operations required by the ANSI/IEEE floating- 
point standard are provided by software libraries. 

The operations explicitly specify the precision of the operation, and round the 
result to the specified precision at the conclusion of each operation. 

A single instruction provides a floating-point multiply with the result fed into a 
floating-point add. The result is computed as if the multiply is performed to 
infinite precision, added as if in infinite precision, then rounded. This operation is 
a particularly good match to the needs of vector linear algebra routines. 
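
The difference between a single final rounding and two separate roundings can be illustrated with the following Python sketch. It is a model for double precision only, not the hardware method. The function name is hypothetical, and the sketch relies on CPython converting an exact rational value to the nearest floating-point value.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Compute a*b + c exactly, then round once to double precision."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no intermediate rounding
    return float(exact)                              # single rounding step


a = 1.0 + 2.0 ** -27
c = -(1.0 + 2.0 ** -26)

separate = a * a + c                   # product is rounded before the add
fused = fused_multiply_add(a, a, c)    # only the final sum is rounded
```

With these inputs the separately rounded computation loses the low-order term of the product, while the fused computation retains it.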

Rounding

Rounding is specified within the instructions explicitly, to avoid maintaining 
explicit state for a rounding mode. 

Exceptions 

All the mandated floating-point exception conditions cause a trap when they 
occur; maintenance of sticky and other status bits may be performed using
software routines. Because the floating-point inexact exception may be very
frequent, this exception only occurs when specified in the instruction explicitly.
Arithmetic operations may also specify that all exceptions are to be handled by 
default, generating special results instead of traps. 

Digital Signal Processing 

The Terpsichore processor provides a set of operations that maintain the fullest 
possible use of 64- and 128-bit data paths when operating on lower-precision
fixed-point or floating-point vector values. These operations are useful for several 
application areas including digital signal processing, image processing, and 
synthetic graphics. The basic goal of these operations is to accelerate the 
performance of algorithms that exhibit the following characteristics: 

Low-precision arithmetic

The operands and intermediate results are fixed-point values represented in no 
greater than 64 bit precision. For floating-point arithmetic, operands and 
intermediate results are of 16, 32, or 64 bit precision. 

The use of fixed-point arithmetic permits various forms of operation reordering 
that are not permitted in floating-point arithmetic. Specifically, commutativity,
associativity, and distribution identities can be used to reorder operations.
Compilers can evaluate operations to determine what intermediate precision is 
required to get the specified arithmetic result. 
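
A small Python sketch can illustrate why such reordering is safe for fixed-point values but not for floating-point values. The function names are hypothetical, and modeling fixed-point addition as wrapping 64-bit integer addition is an assumption of the example.

```python
MASK64 = (1 << 64) - 1


def fixed_add(x, y):
    """64-bit fixed-point addition with wraparound (assumed model)."""
    return (x + y) & MASK64


def sum_left(add, a, b, c):
    return add(add(a, b), c)   # (a + b) + c


def sum_right(add, a, b, c):
    return add(a, add(b, c))   # a + (b + c)


def float_add(x, y):
    return x + y


fixed_same = sum_left(fixed_add, MASK64, 5, 7) == sum_right(fixed_add, MASK64, 5, 7)
float_same = sum_left(float_add, 0.1, 0.2, 0.3) == sum_right(float_add, 0.1, 0.2, 0.3)
```

The fixed-point sums agree in either order, even when an intermediate sum wraps, while the two floating-point sums differ in the last bit because each addition rounds.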

branch predicts that a future execution of the same branch will be taken. More
elaborate prediction may cache the source and target addresses of multiple 
branches, both conditional and unconditional, and both forward and reverse. 

The hardware prediction mechanism is tuned for optimizing conditional branches 
that close loops or express frequent alternatives, and will generally require 
substantially more cycles when executing conditional branches whose outcome is 
not predominately taken or not-taken. For such cases of unpredictable conditional 
results, the use of code which avoids conditional branches in favor of the use of set 
on compare and multiplex instructions may result in greater performance. 

Where the above technique may not be applicable, a Euterpe pipeline may ensure 
that conditional branches which have a small positive offset be handled as if the 
branch is always predicted to be not taken, with the recovery of a misprediction 
causing cancellation of the instructions which have already been issued but not 
completed which would be skipped over by the taken conditional branch. This
"conditional-skip" optimization is performed by the Euterpe implementation and 
requires no specific architectural feature to access or implement. 

A Euterpe pipeline may also perform "branch-return" optimization, in which a
branch-and-link instruction saves a branch target address which is used to predict
the target of the next branch-register instruction. This optimization may be
implemented with a depth of one (only one return address kept), or as a stack of 
finite depth, where a branch and link pushes onto the stack, and a branch-register 
pops from the stack. This optimization can eliminate the misprediction cost of 
simple procedure calls, as the calling branch is susceptible to hardware 
prediction, and the returning branch is predictable by the branch-return 
optimization. Like the conditional-skip optimization described above, this feature is 
performed by the Euterpe implementation and requires no specific architectural 
feature to access or implement. 
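
The branch-return optimization described above can be sketched as a small stack of predicted return addresses. This Python sketch is illustrative only. The class and method names are hypothetical, and discarding the oldest entry when the stack is full is an assumption, since the text does not state the overflow behavior.

```python
from collections import deque


class ReturnAddressPredictor:
    """Finite-depth stack of saved return addresses."""

    def __init__(self, depth=1):
        # deque(maxlen=...) silently drops the oldest entry on overflow (assumption)
        self._stack = deque(maxlen=depth)

    def branch_and_link(self, return_address):
        """A branch-and-link pushes the address following the branch."""
        self._stack.append(return_address)

    def branch_register(self, actual_target):
        """A branch-register pops a prediction; return True if it was correct."""
        predicted = self._stack.pop() if self._stack else None
        return predicted == actual_target


deep = ReturnAddressPredictor(depth=2)
deep.branch_and_link(0x104)   # outer call
deep.branch_and_link(0x208)   # nested call
inner_ok = deep.branch_register(0x208)
outer_ok = deep.branch_register(0x104)

shallow = ReturnAddressPredictor(depth=1)
shallow.branch_and_link(0x104)
shallow.branch_and_link(0x208)
shallow_inner_ok = shallow.branch_register(0x208)
shallow_outer_ok = shallow.branch_register(0x104)
```

With a depth of two, both returns of a nested call are predicted. With a depth of one, only the innermost return is predicted.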

Additional Load and Execute Resources 

Studies of the dynamic distribution of Euterpe instructions on various benchmark 
suites indicate that the most frequently-issued instruction classes are load 
instructions and execute instructions. In a high-performance Euterpe 
implementation, it is advantageous to consider execution pipelines in which the
ability to target the machine resources toward issuing load and execute 
instructions is increased. 

One of the means to increase the ability to issue execute-class instructions is to 
provide the means to issue two execute instructions in a single-issue string. The 
execution unit actually requires several distinct resources, so by partitioning these 
resources, the issue capability can be increased without increasing the number of 
functional units, other than the increased register file read and write ports. The 
partitioning favored for the initial implementation places all instructions that 
involve shifting and shuffling in one execution unit, and all instructions that 
involve multiplication, including fixed-point and floating-point multiply and add in 
another unit. Resources used for implementing add, subtract, and bitwise logical 
operations may be duplicated, being modest in size compared to the shift and
multiply units, or shared between the two units, as the operations have low-enough
latency that two operations might be pipelined within a single issue cycle.
These instructions must generally be independent, except perhaps that two 
simple add, subtract, or bitwise logical may be performed dependently, if the 
resources for executing simple instructions are shared between the execution 
units. 

One of the means to increase the ability to issue load-class instructions is to 
provide the means to issue two load instructions in a single-issue string. This 
would generally increase the resources required of the data fetch unit and the 
data cache, but a compensating solution is to steal the resources for the store 
instruction to execute the second load instruction. Thus, a single-issue string can 
then contain either two load instructions, or one load instruction and one store 
instruction, which uses the same register read ports and address computation 
resources as the basic 5-instruction string. This capability also may be employed to 
provide support for unaligned load and store instructions, where a single-issue 
string may contain as an alternative a single unaligned load or store instruction 
which uses the resources of the two load-class units in concert to accomplish the 
unaligned memory operation. 

Result Forwarding

When temporally adjacent instructions are executed by separate resources, the
results of the first instruction must generally be forwarded directly to the resource
used to execute the second instruction, where the result replaces a value which 
may have been fetched from a register file. Such forwarding paths use significant 
resources. A Terpsichore implementation must generally provide forwarding 
resources so that dependencies from earlier instructions within a string are 
immediately forwarded to later instructions, except between a first and second 
execution instruction as described above. In addition, when forwarding results 
from the execution units back to the data fetch unit, additional delay may be 
incurred. 



-48- 



microunity 



Supplementary Declaration of Korbin S Van Dyke -- Exhibit A 



Terpsichore System Architecture 



Group Floating-point

These operations take two values from registers, perform floating-point arithmetic
on groups of bits in the operands, and place the concatenated results in a register.

Operation codes



GF.ADD.16      Group floating-point add half
GF.ADD.16.C    Group floating-point add half ceiling
GF.ADD.16.F    Group floating-point add half floor
GF.ADD.16.N    Group floating-point add half nearest
GF.ADD.16.T    Group floating-point add half truncate
GF.ADD.16.X    Group floating-point add half exact
GF.ADD.32      Group floating-point add single
GF.ADD.32.C    Group floating-point add single ceiling
GF.ADD.32.F    Group floating-point add single floor
GF.ADD.32.N    Group floating-point add single nearest
GF.ADD.32.T    Group floating-point add single truncate
GF.ADD.32.X    Group floating-point add single exact
GF.ADD.64      Group floating-point add double
GF.ADD.64.C    Group floating-point add double ceiling
GF.ADD.64.F    Group floating-point add double floor
GF.ADD.64.N    Group floating-point add double nearest
GF.ADD.64.T    Group floating-point add double truncate
GF.ADD.64.X    Group floating-point add double exact
GF.DIV.16      Group floating-point divide half
GF.DIV.16.C    Group floating-point divide half ceiling
GF.DIV.16.F    Group floating-point divide half floor
GF.DIV.16.N    Group floating-point divide half nearest
GF.DIV.16.T    Group floating-point divide half truncate
GF.DIV.16.X    Group floating-point divide half exact
GF.DIV.32      Group floating-point divide single
GF.DIV.32.C    Group floating-point divide single ceiling
GF.DIV.32.F    Group floating-point divide single floor
GF.DIV.32.N    Group floating-point divide single nearest
GF.DIV.32.T    Group floating-point divide single truncate
GF.DIV.32.X    Group floating-point divide single exact
GF.DIV.64      Group floating-point divide double
GF.DIV.64.C    Group floating-point divide double ceiling
GF.DIV.64.F    Group floating-point divide double floor
GF.DIV.64.N    Group floating-point divide double nearest
GF.DIV.64.T    Group floating-point divide double truncate
GF.DIV.64.X    Group floating-point divide double exact
GF.MUL.16      Group floating-point multiply half
GF.MUL.16.C    Group floating-point multiply half ceiling
GF.MUL.16.F    Group floating-point multiply half floor
GF.MUL.16.N    Group floating-point multiply half nearest
GF.MUL.16.T    Group floating-point multiply half truncate
GF.MUL.16.X    Group floating-point multiply half exact
GF.MUL.32      Group floating-point multiply single
GF.MUL.32.C    Group floating-point multiply single ceiling
GF.MUL.32.F    Group floating-point multiply single floor
GF.MUL.32.N    Group floating-point multiply single nearest
GF.MUL.32.T    Group floating-point multiply single truncate
GF.MUL.32.X    Group floating-point multiply single exact
GF.MUL.64      Group floating-point multiply double
GF.MUL.64.C    Group floating-point multiply double ceiling
GF.MUL.64.F    Group floating-point multiply double floor
GF.MUL.64.N    Group floating-point multiply double nearest
GF.MUL.64.T    Group floating-point multiply double truncate
GF.MUL.64.X    Group floating-point multiply double exact





            op     prec            round/trap
add         ADD    16 32 64 128    none C F N T X
divide      DIV    16 32 64 128    none C F N T X
multiply    MUL    16 32 64 128    none C F N T X
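
The table above reads as a cross product: each operation combines with each precision and each rounding/trap option. A minimal Python sketch that enumerates the resulting mnemonics follows. The function name `gf_mnemonics` and the use of an empty string for "none" are illustrative assumptions, not part of the manual.

```python
from itertools import product

OPS = ["ADD", "DIV", "MUL"]
PRECS = [16, 32, 64, 128]
ROUNDS = ["", "C", "F", "N", "T", "X"]  # "" stands for the "none" column entry


def gf_mnemonics():
    """Return every GF.op.prec[.round] mnemonic implied by the table."""
    names = []
    for op, prec, rnd in product(OPS, PRECS, ROUNDS):
        name = f"GF.{op}.{prec}"
        if rnd:
            # A rounding/trap suffix is appended only when one is specified.
            name += "." + rnd
        names.append(name)
    return names
```

Under these assumptions the table yields 3 x 4 x 6 = 72 mnemonics, including the `GF.DIV.32.F` and `GF.MUL.64` forms listed in the operation codes.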



GF.op.prec.round rc=ra,rb

 31      24 23   18 17   12 11    6 5        0
|  GF.prec |  ra   |  rb   |  rc   | op.round |



Description 

The contents of registers ra and rb are combined using the specified floating-point 
operation. The result is placed in register rc. The operation is rounded using the 
specified rounding option or using round-to-nearest if not specified. If a rounding 
option is specified, the operation raises a floating-point exception if a floating-point 
invalid operation, divide by zero, overflow, or underflow occurs, or when specified, 
if the result is inexact. If a rounding option is not specified, floating-point 
exceptions are not raised, and are handled according to the default rules of IEEE 
754. 
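
The lane-by-lane behavior described above can be sketched in Python. This is only an illustration under stated assumptions: registers are modeled as 128-bit Python integers, lane i occupies bits i+prec-1..i, and only the default round-to-nearest case for 16-, 32-, and 64-bit precision is modeled. The helper names (`pack_lanes`, `unpack_lanes`, `group_fp`) are hypothetical. The explicit rounding options, the trap behavior, and 128-bit precision are not modeled, and Python itself raises `ZeroDivisionError` or `OverflowError` in cases where IEEE 754 default handling would return an infinity.

```python
import operator
import struct

# struct format codes for IEEE 754 half, single, and double precision
_FMT = {16: "<e", 32: "<f", 64: "<d"}


def _to_float(bits, prec):
    """Interpret a prec-bit integer as an IEEE 754 value."""
    return struct.unpack(_FMT[prec], bits.to_bytes(prec // 8, "little"))[0]


def _to_bits(value, prec):
    """Round a Python float to prec bits (round-to-nearest) and return the bits."""
    return int.from_bytes(struct.pack(_FMT[prec], value), "little")


def pack_lanes(values, prec):
    """Build a register image with values[0] in the least significant lane."""
    reg = 0
    for k, v in enumerate(values):
        reg |= _to_bits(v, prec) << (k * prec)
    return reg


def unpack_lanes(reg, prec):
    """Split a 128-bit register image into its floating-point lanes."""
    mask = (1 << prec) - 1
    return [_to_float((reg >> i) & mask, prec) for i in range(0, 128, prec)]


def group_fp(op, prec, a, b):
    """Apply op to each pair of lanes of a and b and assemble the result."""
    mask = (1 << prec) - 1
    c = 0
    for i in range(0, 128, prec):  # mirrors "for i <- 0 to 128-prec by prec"
        ai = _to_float((a >> i) & mask, prec)
        bi = _to_float((b >> i) & mask, prec)
        c |= _to_bits(op(ai, bi), prec) << i
    return c


a = pack_lanes([1.0, 2.0, 3.0, 4.0], 32)
b = pack_lanes([0.5, 0.5, 0.5, 0.5], 32)
products = unpack_lanes(group_fp(operator.mul, 32, a, b), 32)  # [0.5, 1.0, 1.5, 2.0]
```

With single precision the 128-bit register holds four lanes, so one group multiply produces four independent products.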

Definition 

def GroupFloatingPoint(op,prec,round,ra,rb,rc) as
    a <- RegRead(ra, 128)
    b <- RegRead(rb, 128)
    for i <- 0 to 128-prec by prec
        ai <- F(prec, a_{i+prec-1..i})
        bi <- F(prec, b_{i+prec-1..i})
        if round≠NONE then
Store 

These operations add the contents of two registers to produce a virtual address, 
and store the contents of a register into memory. 

Operation codes 



S.8 [46]


Store byte


S.16.B 


Store double big-endian 


S.16.B.A 


Store double big-endian aligned 


S.16.L


Store double little-endian 


S.16.L.A 


Store double little-endian aligned 


S.32.B 


Store quadlet big-endian 


S.32.B.A 


Store quadlet big-endian aligned 


S.32.L


Store quadlet little-endian 


S.32.L.A 


Store quadlet little-endian aligned 


S.64.B 


Store octlet big-endian 


S.64.B.A 


Store octlet big-endian aligned 


S.64.L 


Store octlet little-endian 


S.64.L.A 


Store octlet little-endian aligned 


S.128.B 


Store hexlet big-endian 


S.128.B.A 


Store hexlet big-endian aligned 


S.128.L


Store hexlet little-endian 


S.128.L.A 


Store hexlet little-endian aligned 


S.AAS.64.B.A 


Store add-and-swap octlet big-endian aligned 


S.AAS.64.L.A 


Store add-and-swap octlet little-endian aligned 


S.CAS.64.B.A


Store compare-and-swap octlet big-endian aligned 


S.CAS.64.L.A 


Store compare-and-swap octlet little-endian aligned 


S.MAS.64.B.A


Store multiplex-and-swap octlet big-endian aligned 


S.MAS.64.L.A 


Store multiplex-and-swap octlet little-endian aligned 


S.MUX.64.B.A 


Store multiplex octlet big-endian aligned 


S.MUX.64.L.A 


Store multiplex octlet little-endian aligned 



size            ordering    alignment
8
16 32 64 128    L B
16 32 64 128    L B         A
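
The size, ordering, and alignment columns above, together with the operation codes, suggest a small grammar for store mnemonics. The following Python sketch parses a mnemonic under that reading. The function name `parse_store` and the dictionary layout are hypothetical, and the validity rules are inferred only from the mnemonics listed in this section (byte stores take no ordering or alignment suffix; the AAS, CAS, MAS, and MUX forms appear only as aligned octlet stores).

```python
import re

_STORE_RE = re.compile(
    r"^S\.(?:(AAS|CAS|MAS|MUX)\.)?(8|16|32|64|128)(?:\.(L|B))?(?:\.(A))?$"
)


def parse_store(mnemonic):
    """Split a store mnemonic into function, size, ordering, and alignment."""
    m = _STORE_RE.match(mnemonic)
    if m is None:
        raise ValueError(f"not a store mnemonic: {mnemonic!r}")
    function, size, ordering, aligned = m.groups()
    size = int(size)
    if size == 8:
        # S.8 carries no ordering or alignment suffix in the listed codes.
        if function or ordering or aligned:
            raise ValueError(f"unexpected suffix on byte store: {mnemonic!r}")
    elif ordering is None:
        raise ValueError(f"byte ordering required: {mnemonic!r}")
    if function and not (size == 64 and aligned):
        # The listed AAS/CAS/MAS/MUX codes are all aligned octlet stores.
        raise ValueError(f"unsupported combination: {mnemonic!r}")
    return {
        "function": function or "NONE",
        "size": size,
        "ordering": ordering,
        "aligned": aligned is not None,
    }
```

For example, `parse_store("S.CAS.64.L.A")` reports function `CAS`, size 64, little-endian ordering, and alignment checking, while `parse_store("S.8.B")` is rejected.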



[46] S.8 need not specify byte ordering, nor need it specify alignment checking, as it 
Format 

op ra,rb,rc

 31      24 23   18 17   12 11    6 5     0
| S.MINOR  |  ra   |  rb   |  rc   |  op   |
     8         6       6       6      6
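
As an illustration of the field layout above, the following Python sketch packs and unpacks a 32-bit instruction word with an 8-bit major field in bits 31..24 and four 6-bit fields below it. The function names and the numeric opcode values used in any call are hypothetical; the excerpt does not give the binary encodings of S.MINOR or of the individual operations.

```python
_FIELDS = (
    # (name, shift, width) taken from the bit positions in the format diagram
    ("major", 24, 8),
    ("ra", 18, 6),
    ("rb", 12, 6),
    ("rc", 6, 6),
    ("op", 0, 6),
)


def encode_store(major, ra, rb, rc, op):
    """Pack the five fields into one 32-bit instruction word."""
    values = {"major": major, "ra": ra, "rb": rb, "rc": rc, "op": op}
    word = 0
    for name, shift, width in _FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
    return word


def decode_store(word):
    """Recover the five fields from a 32-bit instruction word."""
    return {
        name: (word >> shift) & ((1 << width) - 1)
        for name, shift, width in _FIELDS
    }
```

Encoding followed by decoding returns the original field values, which is a quick check that the shifts and widths agree with the diagram.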



Description 

A virtual address is computed from the sum of the contents of register ra and 
register rb. The contents of register rc, treated as the size specified, is stored in 
memory using the specified byte order. 

If alignment is specified, the computed virtual address must be aligned, that is, it 
must be an exact multiple of the size expressed in bytes. If the address is not 
aligned, an "access disallowed by virtual address" exception occurs. 
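
The address computation, alignment check, and byte ordering described above can be sketched as follows. This is a simplified model, not the manual's definition: memory is a flat `bytearray`, the address sum is assumed to wrap at 64 bits, the stored value is assumed to be the low-order `size` bits of register rc, and the class and function names are hypothetical. The add-and-swap, compare-and-swap, and multiplex variants are not modeled.

```python
class AccessDisallowed(Exception):
    """Stands in for the "access disallowed by virtual address" exception."""


def store(memory, ra_val, rb_val, rc_val, size, ordering="L", aligned=False):
    """Store the low `size` bits of rc_val at address ra_val + rb_val."""
    nbytes = size // 8
    va = (ra_val + rb_val) % (1 << 64)  # assumption: 64-bit address wraparound
    if aligned and va % nbytes != 0:
        # Aligned forms require an exact multiple of the size in bytes.
        raise AccessDisallowed(f"address {va:#x} is not {nbytes}-byte aligned")
    if va + nbytes > len(memory):
        raise IndexError("address outside the modeled memory")
    byteorder = "little" if ordering == "L" else "big"
    memory[va:va + nbytes] = (rc_val % (1 << size)).to_bytes(nbytes, byteorder)
    return va
```

For instance, storing a quadlet at address 12 succeeds in both the aligned and unaligned forms, while an aligned quadlet store to address 13 raises the exception.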

Definition 

def Store(op,ra,rb,rc) as
    case op of
        S8,
        S16L, S16LA, S16B, S16BA,
        S32L, S32LA, S32B, S32BA,
        S64L, S64LA, S64B, S64BA,
        S128L, S128LA, S128B, S128BA:
            function <- NONE
        SAAS64BA, SAAS64LA:
            function <- AAS
        SCAS64BA, SCAS64LA:
            function <- CAS
        SMAS64BA, SMAS64LA:
            function <- MAS
        SMUX64BA, SMUX64LA:
            function <- MUX
    endcase
    case op of
        S8:
            size <- 8
        S16L, S16LA, S16B, S16BA:
            size <- 16
        S32L, S32LA, S32B, S32BA:
            size <- 32
        S64L, S64LA, S64B, S64BA,
        SAAS64BA, SAAS64LA:
            size <- 64
        SCAS64BA, SCAS64LA, SMAS64BA, SMAS64LA, SMUX64BA, SMUX64LA:
            size <- 64
        S128L, S128LA, S128B, S128BA:
            size <- 128
    endcase
    case op of
        S16L, S16LA, S16B, S16BA,
        S32L, S32LA, S32B, S32BA,



Store Immediate 

These operations add the contents of a register to a sign-extended immediate 
value to produce a virtual address, and store the contents of a register into 
memory. 
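
The sign extension and address sum described above can be sketched in Python. The excerpt does not state the width of the immediate field, so the 12-bit default below is purely an assumption for illustration, as are the function names and the 64-bit address wraparound. Any scaling of the immediate is not modeled.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def immediate_address(ra_val, imm_field, imm_bits=12):
    """Return ra_val plus the sign-extended immediate, wrapped to 64 bits."""
    return (ra_val + sign_extend(imm_field, imm_bits)) % (1 << 64)
```

With a 12-bit field, the encoding `0xFFF` represents -1, so `immediate_address(100, 0xFFF)` addresses the byte just below the base.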

Operation codes 



S.8.I


Store byte immediate


S.16.B.A.I


Store double big-endian aligned immediate


S.16.B.I


Store double big-endian immediate


S.16.L.A.I


Store double little-endian aligned immediate


S.16.L.I


Store double little-endian immediate


S.32.B.A.I


Store quadlet big-endian aligned immediate


S.32.B.I


Store quadlet big-endian immediate


S.32.L.A.I


Store quadlet little-endian aligned immediate 


S.32.L.I 


Store quadlet little-endian immediate 


S.64.B.A.I


Store octlet big-endian aligned immediate 


S.64.B.I 


Store octlet big-endian immediate 


S.64.L.A.I


Store octlet little-endian aligned immediate 


S.64.L.I


Store octlet little-endian immediate 


S.128.B.A.I 


Store hexlet big-endian aligned immediate 


S.128.B.I 


Store hexlet big-endian immediate 


S.128.L.A.I 


Store hexlet little-endian aligned immediate 


S.128.L.I 


Store hexlet little-endian immediate 


S.AAS.64.B.A.I 


Store add-and-swap octlet big-endian aligned immediate 


S.AAS.64.L.A.I 


Store add-and-swap octlet little-endian aligned immediate 


S.CAS.64.B.A.I 


Store compare-and-swap octlet big-endian aligned immediate 


S.CAS.64.L.A.I 


Store compare-and-swap octlet little-endian aligned immediate 


S.MAS.64.B.A.I


Store multiplex-and-swap octlet big-endian aligned immediate


S.MAS.64.L.A.I 


Store multiplex-and-swap octlet little-endian aligned immediate 


S.MUX.64.B.A.I


Store multiplex octlet big-endian aligned immediate


S.MUX.64.L.A.I 


Store multiplex octlet little-endian aligned immediate



size            ordering    alignment
8
16 32 64 128    L B
16 32 64 128    L B         A



S.8.I







Supplementary Declaration of Korbin S Van Dyke - Exhibit A 







Supplementary Declaration of Korbin S Van Dyke - Exhibit B 







